Optimizing Document Classification: Unleashing the Power of Genetic Algorithms


Abstract

Many individuals, including researchers, professors, and students, encounter difficulties when searching for scholarly documents, papers, and journals within a specific domain. Consequently, scholars have begun to focus on the document classification problem, offering various methods to address this issue. Researchers have utilized diverse data sources, such as citations, metadata, content, and hybrids, in their approaches. Among these, the metadata-based approach stands out for research paper classification because metadata is available at no cost. Various researchers have employed different metadata parameters of research articles, such as title, abstract, keywords, and general terms, for classification. In this study, we chose four metadata features (title, abstract, keywords, and general terms) from the SANTOS dataset, which was prepared by ACM. To represent the metadata features numerically, we employed a semantic-based model called BERT instead of the commonly used count-based models. BERT generates a 768-dimensional vector for each record, which introduces significant time complexity during computation. Additionally, our proposed model optimizes the features using a genetic algorithm. Optimal feature selection plays a crucial role in this domain, enhancing the overall accuracy of the system while reducing the time complexity associated with selecting the most relevant features from a large-dimensional space. For classification purposes, we employed Gaussian Naive Bayes (GNB) and support vector machine (SVM) classifiers. The outcomes of our study showed that the combination of title and keywords outperformed the other combinations.
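The feature-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the genetic operators, so truncation selection, single-point crossover, and bit-flip mutation are assumptions here, as are all function names and parameter values. To stay self-contained, the sketch uses small synthetic 16-dimensional vectors in place of 768-dimensional BERT embeddings and a hand-written Gaussian naive Bayes as the fitness classifier.

```python
import math
import random


def make_data(n_samples, n_features=16, n_informative=4, seed=1):
    """Synthetic stand-in for embeddings: only the first columns carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        X.append([rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


def gnb_accuracy(X_train, y_train, X_test, y_test, mask):
    """Accuracy of a minimal Gaussian naive Bayes using only features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    stats = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        means = [sum(r[j] for r in rows) / len(rows) for j in cols]
        variances = [sum((r[j] - m) ** 2 for r in rows) / len(rows) + 1e-6
                     for j, m in zip(cols, means)]
        stats[label] = (math.log(len(rows) / len(y_train)), means, variances)

    def log_posterior(x, label):
        prior, means, variances = stats[label]
        return prior + sum(-0.5 * math.log(2 * math.pi * v) - (x[j] - m) ** 2 / (2 * v)
                           for j, m, v in zip(cols, means, variances))

    correct = sum(max(stats, key=lambda c: log_posterior(x, c)) == t
                  for x, t in zip(X_test, y_test))
    return correct / len(y_test)


def select_features(X_train, y_train, X_val, y_val, pop_size=20,
                    generations=15, mutation_rate=0.05, seed=0):
    """Genetic search over binary feature masks; fitness is validation accuracy."""
    rng = random.Random(seed)
    n = len(X_train[0])

    def fitness(mask):
        return gnb_accuracy(X_train, y_train, X_val, y_val, mask)

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection; the best masks survive
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # single-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation: each bit flips with probability mutation_rate
            children.append([bit ^ (rng.random() < mutation_rate) for bit in child])
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


X_train, y_train = make_data(120, seed=1)
X_val, y_val = make_data(60, seed=2)
best_mask, best_acc = select_features(X_train, y_train, X_val, y_val)
print("selected feature indices:", [j for j, bit in enumerate(best_mask) if bit])
print("validation accuracy:", best_acc)
```

With real embeddings, each record's vector would replace a synthetic row and the mask would have 768 bits. Since every fitness evaluation retrains the classifier, population size and generation count directly control the running time.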




Journal

Journal title: IEEE Access

Year: 2023

ISSN: 2169-3536

DOI: https://doi.org/10.1109/access.2023.3292248